23 research outputs found

    Detecting click fraud in online advertising: A data mining approach

    National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative. Submit request for dataset at https://larc.smu.edu.sg/buzzcity-mobile-advertisement-dataset

    On the communal analysis suspicion scoring for identity crime in streaming credit applications

    This paper describes a rapid technique, communal analysis suspicion scoring (CASS), for generating numeric suspicion scores on streaming credit applications based on implicit links to each other, over both time and space. CASS includes pair-wise communal scoring of identifier attributes for applications, definition of categories of suspiciousness for application-pairs, the incorporation of temporal and spatial weights, and smoothed k-wise scoring of multiple linked application-pairs. Results on mining several hundred thousand real credit applications demonstrate that CASS reduces false alarm rates while maintaining reasonable hit rates. CASS is scalable for this large data sample, and can rapidly detect early symptoms of identity crime. In addition, new insights have been observed from the relationships between applications.

    Adaptive spike detection for resilient data stream mining

    Automated adversarial detection systems can fail when under attack by adversaries. As part of a resilient data stream mining system to reduce the possibility of such failure, adaptive spike detection is attribute ranking and selection without class-labels. The first part of adaptive spike detection requires weighing all attributes for spiky-ness to rank them. The second part involves filtering some attributes with extreme weights to choose the best ones for computing each example's suspicion score. Within an identity crime detection domain, adaptive spike detection is validated on a few million real credit applications with adversarial activity. The results are F-measure curves on eleven experiments and relative weights discussion on the best experiment. The results reinforce adaptive spike detection's effectiveness for class-label-free attribute ranking and selection.

    Utility of real-time decision-making in commercial data stream mining domains

    The objective is to measure the utility of real-time commercial decision-making. This is important because real-time decisions carry a higher possibility of mistakes, actual occurrences are difficult to record, and predictions produced by algorithms have significant costs. The first contribution is to use overall utility and represent individual utility with a monetary value instead of a prediction. The second is to calculate the benefit from predictions using the utility-based decision threshold. The third is to incorporate the cost of predictions. For experiments, overall utility is used to evaluate communal and spike detection, and their adaptive versions. The overall utility results show that with fewer alerts, communal detection is better than spike detection. With more alerts, adaptive communal and spike detection are better than their static versions. To maximise overall utility with all algorithms, only 1% to 4% in the highest predictions should be alerts.

    Adaptive communal detection in search of adversarial identity crime

    This paper is on adaptive real-time searching of credit application data streams for identity crime with many search parameters. Specifically, we concentrated on handling our domain-specific adversarial activity problem with the adaptive Communal Analysis Suspicion Scoring (CASS) algorithm. CASS's main novel theoretical contribution is in the formulation of State-of-Alert (SoA) which sets the condition of reduced, same, or heightened watchfulness; and Parameter-of-Change (PoC) which improves detection ability with pre-defined parameter values for each SoA. With pre-configured SoA policy and PoC strategy, CASS determines when, what, and how much to adapt its search parameters to ongoing adversarial activity. The above approach is validated with three sets of experiments, where each experiment is conducted on several million real credit applications and measured with three appropriate performance metrics. Significant improvements are achieved over previous work, with the discovery of some practical insights of adaptivity into our domain.

    On the communal analysis suspicion scoring for identity crime in streaming credit applications

    This paper describes a rapid technique, communal analysis suspicion scoring (CASS), for generating numeric suspicion scores on streaming credit applications based on implicit links to each other, over both time and space. CASS includes pair-wise communal scoring of identifier attributes for applications, definition of categories of suspiciousness for application-pairs, the incorporation of temporal and spatial weights, and smoothed k-wise scoring of multiple linked application-pairs. Results on mining several hundred thousand real credit applications demonstrate that CASS reduces false alarm rates while maintaining reasonable hit rates. CASS is scalable for this large data sample, and can rapidly detect early symptoms of identity crime. In addition, new insights have been observed from the relationships between applications.
    Keywords: Risk analysis; Credit application fraud detection; Communal scoring; Multi-attribute directed graph; Dynamic application data streams; Anomaly detection